Methodology for the evaluation of the algorithms for text segmentation based on errors type
Abstract
Text segmentation is a key element of the optical character recognition process. Hence, the testing procedure for text segmentation algorithms is of significant importance. All previous works deal mainly with a text database as a template, which is used both for testing and for evaluating the text segmentation algorithm. However, because of inconsistencies in this process, a methodology for the experiments is required. In this manuscript, a methodology for the evaluation of text segmentation algorithms based on error types is proposed. It is established on various multiline text samples linked with text segmentation. The final result is obtained by comparative analysis of cross-linked data. Its main advantage is its suitability for different types of scripts.
Summary (translated from Polish). Text segmentation is a key element of the optical character recognition process. All previous works deal mainly with a text database as a template. These are used for testing as well as for evaluating the text segmentation algorithm. However, inconsistencies occur in such an algorithm. This paper presents a methodology for evaluating a text segmentation algorithm based on error types. The study was carried out on various multiline text samples. The final result is obtained through comparative analysis of the data. (Methodology for the evaluation of text segmentation algorithms based on error types).
Similar resources
Design, Development and Evaluation of an Orange Sorter Based on Machine Vision and Artificial Neural Network Techniques
ABSTRACT- The high production of orange fruit in Iran calls for quality sorting of this product as a requirement for entering global markets. This study was devoted to the development of an automatic fruit sorter based on size. The hardware consisted of two units. An image acquisition apparatus equipped with a camera, a robotic arm and controller circuits. The second unit consisted of a robotic...
A Modified Character Segmentation Algorithm for Farsi Printed Text Using Upper Contour Labelling
In this paper, a modified segmentation algorithm for printed Farsi words is presented. This algorithm is based on a previous work by Azmi that uses the conditional labeling of the upper contour to find the segmentation points. The main objective is to improve the segmentation results for low quality prints. To achieve this, various modifications on local baseline detection, contour labeling an...
Improving the Quality of Text Classification Using a Two-Level Classifier Committee
Nowadays, automated text classification has gained special importance due to the increasing availability of documents in digital form and the ensuing need to organize them. Although this problem belongs to the Information Retrieval (IR) field, the dominant approach is based on machine learning techniques. Approaches based on classifier committees have shown better performance than the others. I...
Comparison of Two Goal-Oriented Methods for the Evaluation of the Text-Line Segmentation Algorithms
The text line segmentation process is a key step in optical character recognition. Hence, evaluating the efficiency of text line segmentation algorithms is a challenge. Text line segmentation is carried out by applying the algorithms to a text dataset. Furthermore, two goal-oriented methods for the evaluation of the text line segmentation results based on ext...
Publication date: 2011